63 results found.
Written
Treebank,
Language Type:
Multilingual
Languages:
Basque
Availability:
Freely Available
License:
N/A
Size:
15566 words
Production Status:
Existing-used
Use:
Discourse
Paper:
N/A
Documentation:
Iruskieta, M.; Aranzabe, M.J.; Diaz de Ilarraza, A.; Gonzalez, I.; Lersundi, M.; Lopez de la Calle, O. 2013. The RST Basque TreeBank: an online search interface to check rhetorical relations. Paper presented at the 4th Workshop "RST and Discourse Studies", Brazil, October 21-23.
Language Type:
Multilingual
Languages:
Basque Catalan Galician Portuguese Spanish
Availability:
From Owner
License:
Only for research, no distribution to third parties without explicit permission of owner
Size:
22 GByte
Production Status:
Newly created-finished
Use:
Language Identification
Paper:
N/A
Documentation:
Albayzin 2012 LR Evaluation Plan + Interspeech Paper (brief description + summary of results)
Written
Evaluation Data,
Language Type:
Multilingual
Languages:
Basque Bulgarian Danish Dutch English Estonian German Hungarian Irish Italian Portuguese Russian Serbian Slovenian Spanish
Availability:
Freely Available
License:
Size:
3 MByte
Production Status:
Newly created-in progress
Use:
Lexicon Creation/Annotation
- Paper title: A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sina Ahmadi | Monolingual Word Sense Alignment | /N |
Documentation:
None
Written
Dialogue dataset,
Language Type:
Monolingual
Languages:
Basque
Availability:
Freely Available
License:
Creative Commons Attribution-ShareAlike 4.0 International Public License (CC BY-SA 4.0)
Size:
1634 questions Other
Production Status:
Newly created-finished
Use:
Dialogue
- Paper title: Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Arantxa Otegi | ElkarHizketak v1.0 | /N |
Documentation:
None
Written
Treebank,
Language Type:
Monolingual
Languages:
Afrikaans Akkadian Amharic Ancient Greek Arabic Armenian Assyrian Bambara Basque Belarusian Bhojpuri Breton Bulgarian Buryat Cantonese Catalan Chinese Classical Chinese Coptic Croatian Czech Danish Dutch English Erzya Estonian Faroese Finnish French Galician German Gothic Greek Hebrew Hindi Hindi English Hungarian Indonesian Irish Italian Japanese Karelian Kazakh Komi Permyak Komi Zyrian Korean Kurmanji Latin Latvian Lithuanian Livvi Maltese Marathi Mbya Guarani Moksha Naija North Sami Norwegian Old Church Slavonic Old French Old Russian Persian Polish Portuguese Romanian Russian Sanskrit Scottish Gaelic Serbian Skolt Sami Slovak Slovenian Spanish Swedish Swedish Sign Language Swiss German Tagalog Tamil Telugu Thai Turkish Ukrainian Upper Sorbian Urdu Uyghur Vietnamese Warlpiri Welsh Wolof Yoruba
Availability:
Freely Available
License:
Various
Size:
25 million words
Production Status:
Existing-updated
Use:
Parsing and Tagging
- Paper title: Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Joakim Nivre | Universal Dependencies | /N |
Documentation:
https://universaldependencies.org
Written
Corpus,
Language Type:
Monolingual
Languages:
Afrikaans Albanian Arabic Armenian Bangla Basque Bosnian Breton Bulgarian Catalan Croatian Czech Danish Dutch English Esperanto Estonian Filipino Finnish French Galician Georgian German Greek Hebrew Hindi Hungarian Icelandic Indonesian Italian Japanese Kazakh Korean Latvian Lithuanian Macedonian Malay Malayalam Norwegian Persian Polish Portuguese Romanian Russian Serbian Sinhala Slovak Slovenian Spanish Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese pt_br ze_en ze_zh zh_cn zh_tw
Availability:
Freely Available
License:
<Not Specified>
Size:
22.10G tokens
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yo Joong Choe | OpenSubtitles2018 | /N |
Documentation:
Yes, on the website.
Written
Lexicon,
Language Type:
Monolingual
Languages:
Afrikaans Albanian Arabic Armenian Bangla Basque Bosnian Breton Bulgarian Catalan Croatian Czech Danish Dutch English Esperanto Estonian Filipino Finnish French Galician Georgian German Greek Hebrew Hindi Hungarian Icelandic Indonesian Italian Japanese Kazakh Korean Latvian Lithuanian Macedonian Malay Malayalam Norwegian Persian Polish Portuguese Romanian Russian Serbian Sinhala Slovak Slovenian Spanish Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese pt_br ze_en ze_zh zh_cn zh_tw
Availability:
Freely Available
License:
CreativeCommons Attribution 4.0 International
Size:
41 GByte
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yo Joong Choe | word2word | /N |
Documentation:
Yes, on the website.
Written
Corpus,
Language Type:
Bilingual
Languages:
Basque German
Availability:
Freely Available
License:
Size:
None
Production Status:
Newly created-in progress
Use:
Evaluation/Validation
- Paper title: Linguistic Appropriateness and Pedagogic Usefulness of Reading Comprehension Questions
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Itziar Aldabe | German and Basque Questions | /N |
Documentation:
None
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
Adyghe Albanian Ancient Greek Arabic Armenian Asturian Basque Belarusian Bulgarian Catalan Church Slavic Classic Syriac Classical Armenian Czech Danish Dutch English Estonian Faroese Finnish Georgian German Gothic Hindi Hungarian Icelandic Ingrian Irish Kabardian Kalaallisut Kannada Kazakh Khakas Latin Latvian Lithuanian Livonian languages Low German Lower Sorbian Macedonian Maltese Middle French Middle High German Middle Low German Modern Greek Neapolitan Northern Sami Occitan Old English Old French Old Irish Old Saxon Pashto Persian Polish Portuguese Romanian Slovenian Spanish Swedish Tibetan Turkish Turkmen Ukrainian Urdu Veps Votic Welsh
Availability:
Freely Available
License:
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Size:
557.3 MByte
Production Status:
Newly created-in progress
Use:
Morphological Analysis
- Paper title: Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus
- Paper track: Multimodality/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Eleni Metheniti | Wikinflection Corpus | /N |
Documentation:
https://github.com/lenakmeth/Wikinflection-Corpus/blob/master/README.md
Written
,
Language Type:
Monolingual
Languages:
Basque
Availability:
Freely Available
License:
Creative Commons
Size:
Pre-trained models for Basque: BERT, FastText, Flair MByte
Production Status:
Newly created-finished
Use:
- Paper title: Give your Text Representation Models some Love: the Case for Basque
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Rodrigo Agerri | Neural Representation Models for Basque | /N |
Documentation:
Documentation will be available in English




